FIELD OF THE INVENTION

The present invention relates generally to methods for computing 
occurrences of compound events and, more particularly, to methods for computing 
occurrences of compound events from occurrences of primitive events where such 
occurrences are specified as a set of temporal intervals over which the compound 
event type is true. 

BACKGROUND OF THE INVENTION 

Event logic provides a calculus for forming compound event types as 
expressions over primitive event types. The syntax and semantics of event logic will 
be described momentarily. Event-logic expressions denote event types, not event 
occurrences. As such, they do not have truth values. Rather, they are predicates that 
describe the truth conditions that must hold of an interval for an event to occur. In 
contrast, an event-occurrence formula does have a truth value. If $\Phi$ is an event-logic
expression that denotes a primitive or compound event type, and i is an interval, then
$\Phi@i$ is an atomic event-occurrence formula that is true if and only if the truth
conditions for the event type $\Phi$ hold of the interval i.

$\Phi@i$ denotes coincidental occurrence, the fact that an occurrence of $\Phi$
started at the beginning of i and finished at the end of i. $\Phi@i$ would not hold if an
occurrence of $\Phi$ did not precisely coincide with i, but instead overlapped with i. Event
types have internal temporal structure that renders this distinction important. In the
case of primitive event types, that structure is simple. Each primitive event type is
derived from a static predicate. A primitive event type $\Phi$ holds of an interval if the
corresponding static predicate $\phi$ holds of every instant in that interval. This means
that $\neg(\Phi@i)$ and $\neg\Phi@i$ might have different truth values. For example, if $\phi$ is true
of every instant in [0,2) and false of every other instant, then $\neg(\Phi@[1,3))$ is true while
$\neg\Phi@[1,3)$ is false. Event logic takes coincidental occurrence to be a primitive
notion. As will be demonstrated below, overlapping occurrence is a derived notion
that can be expressed in terms of coincidental occurrence using compound event-logic
expressions.

Two auxiliary notions are needed to define the syntax and semantics of
event logic. First, there are thirteen possible relations between two intervals. These
relations are denoted as =, <, >, m, mi, o, oi, s, si, f, fi, d, and di and referred to
collectively as Allen relations throughout this disclosure. Second, the span of two
intervals i and j, denoted SPAN(i, j), is defined as the smallest super-interval of both i
and j.

The syntax of event logic is defined as follows. We are given finite
disjoint sets of constant and variable symbols along with a finite set of primitive
event-type symbols, each of a specified arity. Constant symbols, such as red-block
and hand, denote objects while primitive event-type symbols, such as Supports,
denote parameterized primitive event types. An atomic event-logic expression is a
primitive event-type symbol of arity n applied to a sequence of n constants or
variables, for example, Supports(green-block, x). An event-logic expression is
either an atomic event-logic expression or one of the compound event-logic
expressions $\neg\Phi$, $\Phi \vee \Psi$, $\forall x\,\Phi$, $\exists x\,\Phi$, $\Phi \wedge_R \Psi$, or $\Diamond_R \Phi$, where $\Phi$ and $\Psi$ are
event-logic expressions, x is a variable, and
$R \subseteq$ {=, <, >, m, mi, o, oi, s, si, f, fi, d, di}.
Informally, the semantics of compound event-logic expressions is
defined as follows:

• $\neg\Phi$ denotes the non-occurrence of $\Phi$. An occurrence of $\neg\Phi$
coincides with i if no occurrence of $\Phi$ coincides with i. Note that $(\neg\Phi)@i$ could be
true, even if an occurrence of $\Phi$ overlapped with i, so long as no occurrence of $\Phi$
coincided with i.

• $\Phi \vee \Psi$ denotes the occurrence of either $\Phi$ or $\Psi$.

• $\forall x\,\Phi$ denotes the simultaneous occurrence of $\Phi$ for all objects.

• $\exists x\,\Phi$ denotes the occurrence of $\Phi$ for some object.

• $\Phi \wedge_R \Psi$ denotes the occurrence of both $\Phi$ and $\Psi$. The occurrences of
$\Phi$ and $\Psi$ need not be simultaneous. The subscript R specifies a set of allowed Allen
relations between the occurrences of $\Phi$ and $\Psi$. If occurrences of $\Phi$ and $\Psi$ coincide
with i and j respectively, and i r j for some $r \in R$, then an occurrence of $\Phi \wedge_R \Psi$
coincides with the span of i and j. The special case $\Phi \wedge_{\{=\}} \Psi$ is abbreviated simply as
$\Phi \wedge \Psi$ without any subscript. $\Phi \wedge \Psi$ describes an aggregate event where both $\Phi$ and
$\Psi$ occur simultaneously. The special case $\Phi \wedge_{\{m\}} \Psi$ is also abbreviated as $\Phi;\Psi$. $\Phi;\Psi$
describes an aggregate event where an occurrence of $\Phi$ is immediately followed by an
occurrence of $\Psi$.

• An occurrence of $\Diamond_R \Phi$ coinciding with i denotes an occurrence of $\Phi$
at some other interval j such that j r i for some $r \in R$. $\Diamond_R$ can act as a tense operator.
Expressions such as $\Diamond_{\{<\}}\Phi$, $\Diamond_{\{>\}}\Phi$, $\Diamond_{\{m\}}\Phi$, and $\Diamond_{\{mi\}}\Phi$ specify that $\Phi$ happened in the
noncontiguous past, noncontiguous future, contiguous past, or contiguous future
respectively. The $\Diamond_R$ operator can also be used to derive overlapped occurrence from
coincidental occurrence. An occurrence of $\Diamond_{\{=,o,oi,s,si,f,fi,d,di\}}\Phi$ coincides with i if an
occurrence of $\Phi$ overlaps with i. I abbreviate $\Diamond_{\{=,o,oi,s,si,f,fi,d,di\}}\Phi$ simply as $\Diamond\Phi$
without any subscript. Note that while $(\neg\Phi)@i$ indicates that no occurrence of $\Phi$
coincided with i, $(\neg\Diamond\Phi)@i$ indicates that no occurrence of $\Phi$ overlapped with i.

Formally, the truth of an atomic event-occurrence formula $\Phi@i$ is
defined relative to a model. Let $I$ be the set of all intervals. A model M is a triple
$\langle O, T, P \rangle$, where $O$ is a set of objects, $T$ is a map from constants and variables to
objects from $O$, and $P$ is a map from primitive event-type symbols of arity n to subsets
of $I \times O \times \cdots \times O$ (with n copies of $O$). $P$ thus maps primitive event-type symbols
to relations that take an interval as their first argument, in addition to the remaining
object parameters. $T[x := o]$ denotes a map that is identical to $T$ except that it maps the
variable x to the object o. The semantics of event logic is formally defined by specifying
an entailment relation $M \models \Phi@i$ as follows:

• $\langle O, T, P \rangle \models p(t_1, \ldots, t_n)@i$ if and only if $\langle i, T(t_1), \ldots, T(t_n) \rangle \in P(p)$.

• $M \models (\neg\Phi)@i$ if and only if $M \not\models \Phi@i$.

• $M \models (\Phi \vee \Psi)@i$ if and only if $M \models \Phi@i$ or $M \models \Psi@i$.

• $\langle O, T, P \rangle \models (\forall x\,\Phi)@i$ if and only if $\langle O, T[x := o], P \rangle \models \Phi@i$ for every $o \in O$.

• $\langle O, T, P \rangle \models (\exists x\,\Phi)@i$ if and only if $\langle O, T[x := o], P \rangle \models \Phi@i$ for some $o \in O$.

• $M \models (\Phi \wedge_R \Psi)@i$ if and only if there exist two intervals j and k such that
i = SPAN(j, k), j r k for some $r \in R$, $M \models \Phi@j$, and $M \models \Psi@k$.

• $M \models (\Diamond_R \Phi)@i$ if and only if there exists some interval j such that j r i for some
$r \in R$ and $M \models \Phi@j$.

The overall goal of the event-classification component is to infer all
occurrences of a given set of compound event types from a given set of primitive
event occurrences. Let us define $\mathcal{E}(M, \Phi)$ to be $\{i \mid M \models \Phi@i\}$. In principle,
$\mathcal{E}(M, \Phi)$ could be implemented as a straightforward application of the formal
semantics for event logic as specified above. There is a difficulty in doing so,
however. Primitive event types often have the property that they are liquid. Liquid
events have the following two properties. First, if they are true during an interval i,
then they are also true during any subinterval of i. Second, if they are true during two
overlapping intervals i and j, then they are also true during SPAN(i, j). When primitive
event types are liquid, they will hold over an infinite number of subintervals. This
renders the formal semantics inappropriate for a computational implementation. Even
if one limits oneself to intervals with integral endpoints, liquid primitive event types
will hold over quadratically many subintervals of the scene sequence, so a
straightforward computational implementation of the formal semantics would be
inefficient. A central result of this disclosure is a novel representation, called
spanning intervals, that allows an efficient representation of the infinite sets of
subintervals over which liquid event types hold along with an efficient inference
procedure that operates on that representation. This representation, and the inference
procedure that implements $\mathcal{E}(M, \Phi)$, are presented below.
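To make the quadratic growth concrete, the following minimal Python sketch enumerates the closed subintervals with integral endpoints of a liquid event that holds over [0, n]. The function name `integer_subintervals` is illustrative only and does not come from the disclosure.

```python
def integer_subintervals(lo: int, hi: int) -> list[tuple[int, int]]:
    """All closed intervals [q, r] with integer endpoints, lo <= q <= r <= hi."""
    return [(q, r) for q in range(lo, hi + 1) for r in range(q, hi + 1)]


for n in (10, 100, 1000):
    count = len(integer_subintervals(0, n))
    # The count grows as (n + 1)(n + 2) / 2, i.e. quadratically in n.
    print(n, count, count == (n + 1) * (n + 2) // 2)
```

An explicit enumeration like this is what a direct implementation of the formal semantics would have to store for every liquid primitive, which is the cost the spanning-interval representation is meant to avoid.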

SUMMARY OF THE INVENTION

Therefore it is an object of the present invention to provide a method
for computing all occurrences of a compound event from occurrences of primitive
events which overcomes the problems associated with the prior art methods.
Accordingly, a method for computing all occurrences of a compound
event from occurrences of primitive events is provided where the compound event is a
defined combination of the primitive events. The method comprises the steps of: (a)
defining primitive event types; (b) defining combinations of the primitive event types
as a compound event type; (c) inputting the primitive event occurrences, such
occurrences being specified as the set of temporal intervals over which a given
primitive event type is true; and (d) computing the compound event occurrences, such
occurrences being specified as the set of temporal intervals over which the compound
event type is true, where the sets of temporal intervals in steps (c) and (d) are specified
as smaller sets of spanning intervals, each spanning interval representing a set of
intervals. Preferably, the spanning intervals take the form
${}_{\alpha}[{}_{\gamma}[i,j]_{\delta},\,{}_{\epsilon}[k,l]_{\zeta}]_{\beta}$, where
$\alpha, \beta, \gamma, \delta, \epsilon$, and $\zeta$ are Boolean values, i, j, k, and l are real numbers,
${}_{\alpha}[{}_{\gamma}[i,j]_{\delta},\,{}_{\epsilon}[k,l]_{\zeta}]_{\beta}$ represents the set of all intervals ${}_{\alpha}[p,q]_{\beta}$, where $i \le_{\gamma} p \le_{\delta} j$ and
$k \le_{\epsilon} q \le_{\zeta} l$, ${}_{\alpha}[p,q]_{\beta}$ represents the set of all points r, where $p \le_{\alpha} r \le_{\beta} q$, and $x \le_{\theta} y$
means $x \le y$ when $\theta$ is true and $x < y$ when $\theta$ is false.

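The methods of the present invention provide an efficient
implementation of $\mathcal{E}(M, \Phi)$ along with six subroutines used by $\mathcal{E}(M, \Phi)$, namely
$\langle\mathbf{i}\rangle$, $\mathbf{i}_1 \cap \mathbf{i}_2$, $\neg\mathbf{i}$, SPAN($\mathbf{i}_1$, $\mathbf{i}_2$), $\mathcal{D}(r, \mathbf{i})$, and $\mathcal{I}(\mathbf{i}, r, \mathbf{j})$.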

Also provided are a computer program product for carrying out the 
methods of the present invention and a program storage device for the storage of the 
computer program product therein. 

BRIEF DESCRIPTION OF THE DRAWINGS

These and other features, aspects, and advantages of the methods of the
present invention will become better understood with regard to the following
description, appended claims, and accompanying drawings where:

FIG. 1 illustrates a flowchart of a preferred implementation of the
method steps of the present invention;

FIG. 2 illustrates a flowchart of the structural induction process used to 
implement step 108 of FIG. 1; 

FIG. 3 illustrates the primitive event types used by the computer-
system implementation of one application of the methods of the present invention as
discussed in the "Example" section below;

FIG. 4 illustrates the lexicon of compound event types used by the
computer-system implementation of one application of the methods of the present
invention as discussed in the "Example" section below;

FIGS. 5A and 5B illustrate sequences of video frames depicting a pick
up and put down event respectively, wherein the results of performing segmentation,
tracking, and model reconstruction have been overlaid on the video frames;

FIGS. 6A and 6B illustrate the output of the event-classification
methods of the present invention applied to the model sequences from FIGS. 5A and
5B, respectively;

FIGS. 7A, 7B, 7C, 7D, and 7E illustrate sequences of video frames
depicting stack, unstack, move, assemble and disassemble events, wherein the results
of performing segmentation, tracking, and model reconstruction have been overlaid
on the video frames;

FIGS. 8A, 8B, 8C, 8D, and 8E illustrate the output of the event-
classification methods of the present invention applied to the model sequences from
FIGS. 7A, 7B, 7C, 7D, and 7E, respectively;

FIGS. 9A, 9B, 9C, and 9D illustrate sequences of video frames
depicting: a pick up event from the left instead of from the right; a pick up event with
extraneous objects in the field of view; a sequence of a pick up event followed by a put
down event followed by another pick up event followed by another put down event;
and two simultaneous pick up events, respectively, wherein the results of performing
segmentation, tracking, and model reconstruction have been overlaid on the video
frames;

FIGS. 10A, 10B, 10C, and 10D illustrate the output of the event-
classification methods of the present invention applied to the model sequences from
FIGS. 9A, 9B, 9C, and 9D, respectively;

FIGS. 11A and 11B illustrate sequences of video frames depicting non-
events, wherein the results of performing segmentation, tracking, and model
reconstruction have been overlaid on the video frames; and

FIGS. 12A and 12B illustrate the output of the event-classification
methods of the present invention applied to the model sequences from FIGS. 11A and
11B, respectively.



DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

Referring now to FIG. 1, there is illustrated a flowchart presenting a
general overview of the method steps for computing all occurrences of a compound
event from occurrences of primitive events, where the compound event is a defined
combination of the primitive events. The method comprises defining primitive event
types at step 102. At step 104, combinations of the primitive event types are defined as
a compound event type. At step 106, the primitive event occurrences are input, such
occurrences being specified as the set of temporal intervals over which a given
primitive event type is true. Lastly, at step 108, the compound event occurrences are
computed, such occurrences being specified as the set of temporal intervals over
which the compound event type is true, where the sets of temporal intervals in steps
106 and 108 are specified as smaller sets of spanning intervals, each spanning interval
representing a set of intervals.

The general method steps of FIG. 1 will now be explained in detail.

INTERVALS

One might try to implement event logic using only closed intervals of
the form [q, r], where $q \le r$. Such a closed interval would represent the set
$\{p \mid q \le p \le r\}$ of real numbers. With such closed intervals, one would define the
Allen relations as follows:
One difficulty with doing so is that it would be possible for more than one
Allen relation to hold between two intervals when one or both of them are
instantaneous intervals, such as [q, q]. Both m and s would hold between [q, q]
and [q, r], both mi and si would hold between [q, r] and [q, q], both m and fi would
hold between [q, r] and [r, r], both mi and f would hold between [r, r] and [q, r], and
=, m, and mi would all hold between [q, q] and itself. To create a domain where
exactly one Allen relation holds between any pair of intervals, let us consider both
open and closed intervals. The intervals (q, r], [q, r), and (q, r), where q < r, represent
the sets $\{p \mid q < p \le r\}$, $\{p \mid q \le p < r\}$, and $\{p \mid q < p < r\}$ of real numbers
respectively. The various kinds of open and closed intervals can be unified into a
single representation ${}_{\alpha}[q, r]_{\beta}$, where $\alpha$ and $\beta$ are true or false to indicate the interval
being closed or open on the left or right respectively. To do this, let us use $q \le_{\alpha} r$ to
mean $q \le r$ when $\alpha$ is true and $q < r$ when $\alpha$ is false. Similarly, let us use $q \ge_{\alpha} r$ to
mean $q \ge r$ when $\alpha$ is true and $q > r$ when $\alpha$ is false. With these, ${}_{\alpha}[q, r]_{\beta}$
represents the set $\{p \mid q \le_{\alpha} p \le_{\beta} r\}$ of real numbers. Given this, one can define the
Allen relations as follows:

With the above definitions, exactly one Allen relation holds between
any pair of intervals.

The set of real numbers represented by an interval is referred to as its
extension. Given the above definition of interval, any interval, such as [5, 4], (5, 4],
[5, 4), or (5, 4), where the upper endpoint is less than the lower endpoint represents the
empty set. And any open interval, such as [5, 5), (5, 5], or (5, 5), where the upper
endpoint equals the lower endpoint also represents the empty set. To create a situation
where the extension of each interval has a unique representation, let us represent all
such empty sets of real numbers as { }. Thus whenever we represent an interval
${}_{\alpha}[q, r]_{\beta}$ explicitly, it will have a nonempty extension and will satisfy the following
normalization criterion: $q \le_{\alpha \wedge \beta} r$.
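The unified interval representation and its normalization criterion can be sketched in Python as follows. This is a minimal illustration, not the disclosed implementation: the class name `Interval`, the helper `le`, and the field names are assumptions, and the criterion is read as $q \le_{\alpha \wedge \beta} r$.

```python
from dataclasses import dataclass


def le(x: float, y: float, closed: bool) -> bool:
    """x <=_closed y: non-strict when `closed` is true, strict otherwise."""
    return x <= y if closed else x < y


@dataclass(frozen=True)
class Interval:
    alpha: bool  # True if the lower endpoint is closed
    q: float     # lower endpoint
    r: float     # upper endpoint
    beta: bool   # True if the upper endpoint is closed

    def contains(self, p: float) -> bool:
        """Membership of the point p in the extension {p | q <=_alpha p <=_beta r}."""
        return le(self.q, p, self.alpha) and le(p, self.r, self.beta)

    def is_normalized(self) -> bool:
        """Nonempty extension: q <= r if both ends are closed, q < r otherwise."""
        return le(self.q, self.r, self.alpha and self.beta)


print(Interval(True, 5, 5, True).is_normalized())    # [5, 5] is a single point
print(Interval(True, 5, 5, False).is_normalized())   # [5, 5) is empty
print(Interval(True, 5, 4, True).is_normalized())    # [5, 4] is empty
```

Intervals that fail `is_normalized` all have the empty extension and would be collapsed to the single representation { }.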
SPANNING INTERVALS

When using event logic, we wish to compute and represent the set $I$ of
all intervals over which some event-logic expression $\Phi$ holds. Many primitive event
types, including all of the primitive event types used in the computer system
implementation of the application described in the "Example" section below, are
liquid in the sense that if some event holds of an interval then that event holds of every
subinterval of that interval. With real-valued interval endpoints, this creates the need
to compute and represent an infinite set of intervals for a liquid event. Even limiting
ourselves to integer-valued interval endpoints, a liquid event will require the
computation and representation of quadratically many intervals.

To address this problem, let us introduce the notion of a spanning
interval. A spanning interval [i : j] represents the set of all subintervals of [i, j], in
other words $\{[q, r] \mid i \le q \le j \wedge i \le r \le j\}$. Similarly (i : j], [i : j), and (i : j)
represent $\{(q, r] \mid i \le q \le j \wedge i \le r \le j\}$, $\{[q, r) \mid i \le q \le j \wedge i \le r \le j\}$, and
$\{(q, r) \mid i \le q \le j \wedge i \le r \le j\}$ respectively. What we desire is to use spanning intervals
to represent the set of all intervals over which the primitive event types hold and to
compute and represent the set of all intervals over which compound event types hold
via structural induction over the compound event-logic expressions. A problem arises
however. Given two liquid event types $\Phi$ and $\Psi$, the compound event type $\Phi;\Psi$ is not
liquid. If $\Phi$ holds over [i : j) and $\Psi$ holds over [j : k), then $\Phi;\Psi$ might not hold over
every subinterval of [i, k). It holds over only those subintervals that include j. Such
event types are referred to as semi-liquid. Since spanning intervals are not sufficient to
efficiently represent semi-liquid events, let us extend the notion of a spanning interval.
A spanning interval [[i, j], [k, l]] represents the set of intervals
$\{[q, r] \mid i \le q \le j \wedge k \le r \le l\}$. Similarly the spanning intervals
([i, j], [k, l]], [[i, j], [k, l]), and ([i, j], [k, l]) represent the sets
$\{(q, r] \mid i \le q \le j \wedge k \le r \le l\}$, $\{[q, r) \mid i \le q \le j \wedge k \le r \le l\}$, and
$\{(q, r) \mid i \le q \le j \wedge k \le r \le l\}$ respectively. This extended notion of spanning interval
subsumes the original notion. The spanning intervals [i : j], (i : j], [i : j), and (i : j)
can be represented as the spanning intervals [[i, j], [i, j]], ([i, j], [i, j]], [[i, j], [i, j]),
and ([i, j], [i, j]) respectively. For reasons that will become apparent below, it is
necessary to also allow for spanning intervals where the ranges of endpoint values are
open. In other words, we will need to consider spanning intervals like [(i, j], [k, l]] to
represent sets like $\{[q, r] \mid i < q \le j \wedge k \le r \le l\}$. All told, there are six endpoints that
can independently be either open or closed: q, r, i, j, k, and l, yielding sixty-four kinds
of spanning intervals. These can all be unified into a single representation,
${}_{\alpha}[{}_{\gamma}[i,j]_{\delta},\,{}_{\epsilon}[k,l]_{\zeta}]_{\beta}$, where $\alpha, \beta, \gamma, \delta, \epsilon$, and $\zeta$ are true or false if the endpoints q, r, i, j,
k, and l are closed or open respectively. More precisely, the spanning interval
${}_{\alpha}[{}_{\gamma}[i,j]_{\delta},\,{}_{\epsilon}[k,l]_{\zeta}]_{\beta}$ represents the set

$$\{\, {}_{\alpha}[q, r]_{\beta} \mid i \le_{\gamma} q \le_{\delta} j \wedge k \le_{\epsilon} r \le_{\zeta} l \,\} \qquad (14)$$

of intervals. The set of intervals represented by a spanning interval is referred to as its
extension. Moreover, a set of spanning intervals will represent the union of the
extensions of its members and the empty set of spanning intervals will represent the
empty set of intervals. The set of intervals represented by a set of spanning intervals is
further referred to as its extension. A key result of this disclosure is that if the set of
all intervals over which some set of primitive event types hold can be represented as
finite sets of spanning intervals then the set of all intervals over which all event types
that are expressible as compound event-logic expressions over those primitives hold
can also be represented as finite sets of spanning intervals.
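The extension defined by (14) can be illustrated with a small Python membership test. This is a sketch under stated assumptions: the class names, field names, and the helper `le` are hypothetical, and membership additionally requires that the candidate interval itself be nonempty with finite endpoints.

```python
from dataclasses import dataclass
from math import isfinite


def le(x: float, y: float, closed: bool) -> bool:
    """x <=_closed y: non-strict when `closed` is true, strict otherwise."""
    return x <= y if closed else x < y


@dataclass(frozen=True)
class Interval:
    alpha: bool
    q: float
    r: float
    beta: bool


@dataclass(frozen=True)
class SpanningInterval:
    alpha: bool    # closedness of the lower endpoint q of member intervals
    gamma: bool    # closedness of the bound i on q
    i: float
    j: float
    delta: bool    # closedness of the bound j on q
    epsilon: bool  # closedness of the bound k on r
    k: float
    l: float
    zeta: bool     # closedness of the bound l on r
    beta: bool     # closedness of the upper endpoint r of member intervals

    def contains(self, x: Interval) -> bool:
        """True if x is in the extension: same type, endpoints within range."""
        return (
            x.alpha == self.alpha and x.beta == self.beta
            and isfinite(x.q) and isfinite(x.r)
            and le(x.q, x.r, x.alpha and x.beta)  # x itself is nonempty
            and le(self.i, x.q, self.gamma) and le(x.q, self.j, self.delta)
            and le(self.k, x.r, self.epsilon) and le(x.r, self.l, self.zeta)
        )


# The spanning interval [[0, 2], [5, 7]].
s = SpanningInterval(True, True, 0, 2, True, True, 5, 7, True, True)
print(s.contains(Interval(True, 1, 6, True)))   # [1, 6]: True
print(s.contains(Interval(True, 3, 6, True)))   # [3, 6]: False, q out of range
print(s.contains(Interval(False, 1, 6, True)))  # (1, 6]: False, wrong type
```

Ten fields describe infinitely many member intervals, which is the economy the representation is designed to provide.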



NORMALIZING SPANNING INTERVALS 
While we require that all intervals have finite endpoints, we allow
spanning intervals to have infinite endpoints, for instance, $[[-\infty, j], [k, l]]$. Such
spanning intervals with infinite endpoints represent sets of intervals with finite
endpoints but where the range of possible endpoints is unconstrained from above or
below.

Just as we desire that the extension of every interval have a unique
representation, we also desire that the extension of every spanning interval have a
unique representation. There are a number of situations where two different spanning
intervals will have the same extension. First, all spanning intervals
${}_{\alpha}[{}_{\gamma}[i,j]_{\delta},\,{}_{\epsilon}[k,l]_{\zeta}]_{\beta}$, where $i = \infty$, $j = -\infty$, $k = \infty$, or $l = -\infty$ represent the empty set of
intervals, because there are no intervals with an endpoint that is less than or equal to
minus infinity or greater than or equal to infinity. Second, if $i = -\infty$, $j = \infty$, $k = -\infty$, or
$l = \infty$, the value of $\gamma$, $\delta$, $\epsilon$, or $\zeta$ respectively does not affect the denotation, because
there are no intervals with infinite endpoints. Third, if j > l, j can be decreased as far
as l without changing the denotation, because all intervals where the upper endpoint
is less than the lower endpoint equivalently denote the empty interval. Similarly, if
k < i, k can be increased as far as i without changing the denotation. Fourth, all
spanning intervals where i > j or k > l represent the empty set of intervals, because
the range of possible endpoints would be empty. Fifth, all spanning intervals where
i = j and either $\gamma$ or $\delta$ is false (indicating an open range for the lower endpoint)
represent the empty set of intervals, because the range of possible endpoints would be
empty. Similarly, all spanning intervals where k = l and either $\epsilon$ or $\zeta$ is false
(indicating an open range for the upper endpoint) also represent the empty set of
intervals. Sixth, all spanning intervals where i = l and either $\alpha$ or $\beta$ is false
(indicating an open interval) also represent the empty set of intervals, because the
endpoints of an open interval must be different. Seventh, if j = l and $\zeta$ is false, the
value of $\delta$ does not affect the denotation, because if j = l and $\zeta$ is false, the upper
endpoint must be less than l and the lower endpoint must be less than or equal to j,
which equals l, so the lower endpoint must be less than j. Similarly, if k = i and $\gamma$ is
false, the value of $\epsilon$ does not affect the denotation. Eighth, if j = l and either $\alpha$ or
$\beta$ is false, the value of $\delta$ does not affect the denotation, because the lower endpoint
of an open interval must be less than its upper endpoint. Similarly, if k = i and either
$\alpha$ or $\beta$ is false, the value of $\epsilon$ does not affect the denotation.

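To create a situation where the extension of every spanning interval has
a unique representation, let us represent all empty sets of intervals as { }. And when
the values of $i, j, k, l, \alpha, \beta, \gamma, \delta, \epsilon$, or $\zeta$ can be changed without changing the
denotation, we will select the tightest such values, in other words, false values for the
Boolean parameters, maximal values for the lower bounds, and minimal values for the
upper bounds. Thus whenever we represent a spanning interval ${}_{\alpha}[{}_{\gamma}[i,j]_{\delta},\,{}_{\epsilon}[k,l]_{\zeta}]_{\beta}$
explicitly, it will have a nonempty extension and will satisfy the following
normalization criterion:

(1) $i \ne \infty \wedge j \ne -\infty \wedge k \ne \infty \wedge l \ne -\infty \;\wedge$

(2) $(i = -\infty \to \neg\gamma) \wedge (j = \infty \to \neg\delta) \wedge (k = -\infty \to \neg\epsilon) \wedge (l = \infty \to \neg\zeta) \;\wedge$

(3) $j \le l \wedge k \ge i \;\wedge$

(4) $i \le j \wedge k \le l \;\wedge$

(5) $(i \ne j \vee \gamma \wedge \delta) \wedge (k \ne l \vee \epsilon \wedge \zeta) \;\wedge$

(6) $(i \ne l \vee \alpha \wedge \beta) \;\wedge$

(7) $[(j = l \wedge \neg\zeta) \to \neg\delta] \wedge [(k = i \wedge \neg\gamma) \to \neg\epsilon] \;\wedge$

(8) $\{[j = l \wedge (\neg\alpha \vee \neg\beta)] \to \neg\delta\} \wedge \{[k = i \wedge (\neg\alpha \vee \neg\beta)] \to \neg\epsilon\}$

Criteria (1) through (8) correspond to points one through eight above.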

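A spanning interval ${}_{\alpha}[{}_{\gamma}[i,j]_{\delta},\,{}_{\epsilon}[k,l]_{\zeta}]_{\beta}$ is normalized if
$i, j, k, l, \alpha, \beta, \gamma, \delta, \epsilon$, and $\zeta$ cannot be changed without changing its denotation. Given
a (potentially non-normalized) spanning interval $\mathbf{i}$, its normalization $\langle\mathbf{i}\rangle$ is the smallest
set of normalized spanning intervals that represents the extension of $\mathbf{i}$. One can
compute $\langle\mathbf{i}\rangle$ as follows:

$$\langle {}_{\alpha}[{}_{\gamma}[i,j]_{\delta},\,{}_{\epsilon}[k,l]_{\zeta}]_{\beta} \rangle \triangleq
\begin{cases}
\{ {}_{\alpha}[{}_{\gamma'}[i,j']_{\delta'},\,{}_{\epsilon'}[k',l]_{\zeta'}]_{\beta} \} &
\begin{aligned}[t]
\text{when } & i \le j' \wedge k' \le l \;\wedge \\
& [i = j' \to (\gamma' \wedge \delta')] \wedge [k' = l \to (\epsilon' \wedge \zeta')] \;\wedge \\
& [i = l \to (\alpha \wedge \beta)] \;\wedge \\
& i \ne \infty \wedge j' \ne -\infty \wedge k' \ne \infty \wedge l \ne -\infty
\end{aligned} \\
\{\,\} & \text{otherwise}
\end{cases}$$

where
$j' = \min(j, l)$
$k' = \max(k, i)$
$\gamma' = \gamma \wedge i \ne -\infty$
$\delta' = \delta \wedge \min(j, l) \ne \infty \wedge (j < l \vee \zeta \wedge \alpha \wedge \beta)$
$\epsilon' = \epsilon \wedge \max(k, i) \ne -\infty \wedge (k > i \vee \gamma \wedge \beta \wedge \alpha)$
$\zeta' = \zeta \wedge l \ne \infty$

An important property of spanning intervals is that for any spanning
interval $\mathbf{i}$, $\langle\mathbf{i}\rangle$ contains at most one normalized spanning interval.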
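The normalization procedure can be sketched in Python as below. This is an illustrative reading rather than the disclosed implementation: the names `SpanningInterval` and `normalize` are hypothetical, and the rule for the tightened flag on l (`zeta2`) is assumed by symmetry with the rule for the flag on i and with criterion (2).

```python
from dataclasses import dataclass, replace
from math import inf


@dataclass(frozen=True)
class SpanningInterval:
    alpha: bool
    gamma: bool
    i: float
    j: float
    delta: bool
    epsilon: bool
    k: float
    l: float
    zeta: bool
    beta: bool


def normalize(s: SpanningInterval) -> list[SpanningInterval]:
    """Return [] for an empty extension, else a one-element list."""
    j2 = min(s.j, s.l)  # the lower endpoint cannot exceed the upper endpoint
    k2 = max(s.k, s.i)  # the upper endpoint cannot precede the lower endpoint
    gamma2 = s.gamma and s.i != -inf
    delta2 = s.delta and j2 != inf and (
        s.j < s.l or (s.zeta and s.alpha and s.beta))
    epsilon2 = s.epsilon and k2 != -inf and (
        s.k > s.i or (s.gamma and s.beta and s.alpha))
    zeta2 = s.zeta and s.l != inf  # assumed by symmetry with gamma2
    nonempty = (
        s.i <= j2 and k2 <= s.l
        and (s.i != j2 or (gamma2 and delta2))
        and (k2 != s.l or (epsilon2 and zeta2))
        and (s.i != s.l or (s.alpha and s.beta))
        and s.i != inf and j2 != -inf and k2 != inf and s.l != -inf
    )
    if not nonempty:
        return []
    return [replace(s, gamma=gamma2, j=j2, delta=delta2,
                    epsilon=epsilon2, k=k2, zeta=zeta2)]


# [[0, 10], [-5, 7]] has the same extension as [[0, 7], [0, 7]].
loose = SpanningInterval(True, True, 0, 10, True, True, -5, 7, True, True)
print(normalize(loose))
# [[3, 3), [4, 5]] has an empty range for the lower endpoint.
print(normalize(SpanningInterval(True, True, 3, 3, False, True, 4, 5, True, True)))
```

Returning a list mirrors the property that a normalization contains at most one normalized spanning interval.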



COMPUTING THE INTERSECTION OF TWO NORMALIZED SPANNING 
INTERVALS 

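Given two normalized spanning intervals $\mathbf{i}_1$ and $\mathbf{i}_2$, their intersection
$\mathbf{i}_1 \cap \mathbf{i}_2$ is a set of normalized spanning intervals whose extension is the intersection of
the extensions of $\mathbf{i}_1$ and $\mathbf{i}_2$. One can compute $\mathbf{i}_1 \cap \mathbf{i}_2$ as follows, writing
$\mathbf{i}_1 = {}_{\alpha_1}[{}_{\gamma_1}[i_1,j_1]_{\delta_1},\,{}_{\epsilon_1}[k_1,l_1]_{\zeta_1}]_{\beta_1}$ and $\mathbf{i}_2 = {}_{\alpha_2}[{}_{\gamma_2}[i_2,j_2]_{\delta_2},\,{}_{\epsilon_2}[k_2,l_2]_{\zeta_2}]_{\beta_2}$:

$$\mathbf{i}_1 \cap \mathbf{i}_2 \triangleq
\begin{cases}
\langle {}_{\alpha_1}[{}_{\gamma}[\max(i_1, i_2), \min(j_1, j_2)]_{\delta},\,{}_{\epsilon}[\max(k_1, k_2), \min(l_1, l_2)]_{\zeta}]_{\beta_1} \rangle & \text{when } \alpha_1 = \alpha_2 \wedge \beta_1 = \beta_2 \\
\{\,\} & \text{otherwise}
\end{cases}$$

An important property of normalized spanning intervals is that for any
two normalized spanning intervals $\mathbf{i}_1$ and $\mathbf{i}_2$, $\mathbf{i}_1 \cap \mathbf{i}_2$ contains at most one normalized
spanning interval.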

The intuition behind the above definition is as follows. All of the
intervals in the extension of a spanning interval are of the same type, namely,
[q, r], (q, r], [q, r), or (q, r). The intersection of two spanning intervals has a nonempty
extension only if the two spanning intervals contain the same type of intervals in their
extension. If they do, and the sets contain intervals whose lower endpoint is bound
from below by $i_1$ and $i_2$ respectively, then the intersection will contain intervals
whose lower endpoint is bound from below by both $i_1$ and $i_2$. The resulting bound is
open or closed depending on which of the input bounds is tighter. Similarly for the
upper bound on the lower endpoint and the lower and upper bounds on the upper
endpoint.
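The intuition above can be sketched in Python as follows. The names `intersect` and `tighter` are hypothetical, the tie rule (a bound shared by both inputs is closed only if both inputs are closed there) is an assumption about what "tighter" means for equal bounds, and the final normalization step is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpanningInterval:
    alpha: bool
    gamma: bool
    i: float
    j: float
    delta: bool
    epsilon: bool
    k: float
    l: float
    zeta: bool
    beta: bool


def tighter(flag1: bool, bound1: float, flag2: bool, bound2: float,
            lower: bool) -> bool:
    """Closedness of the tighter of two bounds (assumed tie rule: both closed)."""
    if bound1 == bound2:
        return flag1 and flag2
    if lower:  # for lower bounds, the larger value is tighter
        return flag1 if bound1 > bound2 else flag2
    return flag1 if bound1 < bound2 else flag2  # upper bounds: smaller is tighter


def intersect(s1: SpanningInterval,
              s2: SpanningInterval) -> Optional[SpanningInterval]:
    """Un-normalized intersection, or None when the interval types differ."""
    if (s1.alpha, s1.beta) != (s2.alpha, s2.beta):
        return None
    return SpanningInterval(
        alpha=s1.alpha,
        gamma=tighter(s1.gamma, s1.i, s2.gamma, s2.i, lower=True),
        i=max(s1.i, s2.i),
        j=min(s1.j, s2.j),
        delta=tighter(s1.delta, s1.j, s2.delta, s2.j, lower=False),
        epsilon=tighter(s1.epsilon, s1.k, s2.epsilon, s2.k, lower=True),
        k=max(s1.k, s2.k),
        l=min(s1.l, s2.l),
        zeta=tighter(s1.zeta, s1.l, s2.zeta, s2.l, lower=False),
        beta=s1.beta,
    )


# [[0, 4], [2, 8]] intersected with [(1, 6], [3, 9]] gives [(1, 4], [3, 8]].
a = SpanningInterval(True, True, 0, 4, True, True, 2, 8, True, True)
b = SpanningInterval(True, False, 1, 6, True, True, 3, 9, True, True)
print(intersect(a, b))
```

A complete implementation would pass the result through the normalization procedure, which also detects an empty intersection.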
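COMPUTING THE COMPLEMENT OF A NORMALIZED SPANNING
INTERVAL

Given a normalized spanning interval $\mathbf{i}$, its complement $\neg\mathbf{i}$ is a set of
normalized spanning intervals whose extension is the complement of the extension of
$\mathbf{i}$. One can compute $\neg\mathbf{i}$ as follows, where T denotes true:

$$\begin{aligned}
\neg\, {}_{\alpha}[{}_{\gamma}[i,j]_{\delta},\,{}_{\epsilon}[k,l]_{\zeta}]_{\beta} \triangleq\;
& \langle {}_{\alpha}[{}_{T}[-\infty, \infty]_{T},\,{}_{T}[-\infty, k]_{\neg\epsilon}]_{\beta} \rangle \;\cup \\
& \langle {}_{\alpha}[{}_{T}[-\infty, \infty]_{T},\,{}_{\neg\zeta}[l, \infty]_{T}]_{\beta} \rangle \;\cup \\
& \langle {}_{\alpha}[{}_{T}[-\infty, i]_{\neg\gamma},\,{}_{T}[-\infty, \infty]_{T}]_{\beta} \rangle \;\cup \\
& \langle {}_{\alpha}[{}_{\neg\delta}[j, \infty]_{T},\,{}_{T}[-\infty, \infty]_{T}]_{\beta} \rangle \;\cup \\
& \langle {}_{\neg\alpha}[{}_{T}[-\infty, \infty]_{T},\,{}_{T}[-\infty, \infty]_{T}]_{\beta} \rangle \;\cup \\
& \langle {}_{\alpha}[{}_{T}[-\infty, \infty]_{T},\,{}_{T}[-\infty, \infty]_{T}]_{\neg\beta} \rangle \;\cup \\
& \langle {}_{\neg\alpha}[{}_{T}[-\infty, \infty]_{T},\,{}_{T}[-\infty, \infty]_{T}]_{\neg\beta} \rangle
\end{aligned}$$

An important property of normalized spanning intervals is that for any
normalized spanning interval $\mathbf{i}$, $\neg\mathbf{i}$ contains at most seven normalized spanning
intervals.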

The intuition behind the above definition is as follows. First note that
the negation of $q \le_{\alpha} r$ is $q \ge_{\neg\alpha} r$. Next note that the extension of $\mathbf{i}$ contains intervals
whose endpoints q and r satisfy $q \ge_{\gamma} i \wedge q \le_{\delta} j \wedge r \ge_{\epsilon} k \wedge r \le_{\zeta} l$. Thus the extension
of $\neg\mathbf{i}$ contains intervals whose endpoints satisfy the negation of this, namely,
$q \le_{\neg\gamma} i \vee q \ge_{\neg\delta} j \vee r \le_{\neg\epsilon} k \vee r \ge_{\neg\zeta} l$. Such a disjunction requires four spanning
intervals, the first four in the above definition. Additionally, if the extension of $\mathbf{i}$
contains intervals of the form [q, r], the extension of $\neg\mathbf{i}$ will contain all intervals not
of the form [q, r], namely, (q, r], [q, r), and (q, r). Similarly for the cases where the
extension of $\mathbf{i}$ contains intervals of the form (q, r], [q, r), or (q, r). This accounts for
the last three spanning intervals in the above definition.

We now see why it is necessary to allow spanning intervals to have
open ranges of endpoint values. The complement of a spanning interval, such as
$[[i, j], [k, l]]$, with closed endpoint ranges, includes spanning intervals, such as
$[[-\infty, i), [-\infty, \infty]]$, with open endpoint ranges.
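The seven-part complement can be sketched in Python as follows. This is a hedged illustration: `complement` and `member` are hypothetical names, the seven pieces are returned un-normalized (infinite bounds keep closed flags, which is harmless for the membership test because member intervals have finite endpoints), and the order of the pieces is not significant.

```python
from dataclasses import dataclass
from math import inf


def le(x: float, y: float, closed: bool) -> bool:
    """x <=_closed y: non-strict when `closed` is true, strict otherwise."""
    return x <= y if closed else x < y


@dataclass(frozen=True)
class SpanningInterval:
    alpha: bool
    gamma: bool
    i: float
    j: float
    delta: bool
    epsilon: bool
    k: float
    l: float
    zeta: bool
    beta: bool


def member(s: SpanningInterval, alpha: bool, q: float, r: float,
           beta: bool) -> bool:
    """Is the nonempty interval alpha[q, r]beta in the extension of s?"""
    return (alpha == s.alpha and beta == s.beta
            and le(q, r, alpha and beta)
            and le(s.i, q, s.gamma) and le(q, s.j, s.delta)
            and le(s.k, r, s.epsilon) and le(r, s.l, s.zeta))


def complement(s: SpanningInterval) -> list[SpanningInterval]:
    """Seven un-normalized spanning intervals covering everything outside s."""
    full = dict(gamma=True, i=-inf, j=inf, delta=True,
                epsilon=True, k=-inf, l=inf, zeta=True)

    def piece(alpha: bool, beta: bool, **bounds) -> SpanningInterval:
        return SpanningInterval(alpha=alpha, beta=beta, **{**full, **bounds})

    a, b = s.alpha, s.beta
    return [
        piece(a, b, l=s.k, zeta=not s.epsilon),   # r too small
        piece(a, b, k=s.l, epsilon=not s.zeta),   # r too large
        piece(a, b, j=s.i, delta=not s.gamma),    # q too small
        piece(a, b, i=s.j, gamma=not s.delta),    # q too large
        piece(not a, b),                          # different interval types
        piece(a, not b),
        piece(not a, not b),
    ]


s = SpanningInterval(True, True, 0, 2, False, True, 5, 7, True, False)
print(len(complement(s)))  # 7
```

Note how negating a closed bound produces an open one, which is the reason open endpoint ranges are needed in the representation.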

COMPUTING THE SPAN OF TWO NORMALIZED SPANNING INTERVALS 

The span of two intervals $i_1$ and $i_2$, denoted SPAN($i_1$, $i_2$), is the smallest
interval whose extension contains the extensions of both $i_1$ and $i_2$. For example, the
span of (1,4) and [2,6] is (1,6]. And the span of [3,7) and (3,7] is [3,7]. More
generally, the lower endpoint of SPAN($i_1$, $i_2$) is the minimum of the lower endpoints of
$i_1$ and $i_2$. And the lower endpoint of SPAN($i_1$, $i_2$) is open or closed depending on
whether the smaller of the lower endpoints of $i_1$ and $i_2$ is open or closed. Analogously,
the upper endpoint of SPAN($i_1$, $i_2$) is the maximum of the upper endpoints of $i_1$ and $i_2$.
And the upper endpoint of SPAN($i_1$, $i_2$) is open or closed depending on whether the
larger of the upper endpoints of $i_1$ and $i_2$ is open or closed. More precisely,
SPAN($i_1$, $i_2$) can be computed as follows:

$$\mathrm{SPAN}({}_{\alpha_1}[q_1, r_1]_{\beta_1},\, {}_{\alpha_2}[q_2, r_2]_{\beta_2}) \triangleq
{}_{(\alpha_1 \wedge q_1 \le q_2) \vee (\alpha_2 \wedge q_1 \ge q_2)}[\min(q_1, q_2), \max(r_1, r_2)]_{(\beta_1 \wedge r_1 \ge r_2) \vee (\beta_2 \wedge r_1 \le r_2)}$$

The notion of span will be used below.
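The span of two intervals, as described in the preceding paragraph, can be sketched in Python as follows. The names `Interval` and `span` are illustrative; when the two endpoints coincide, the result is taken to be closed if either input is closed there, which agrees with the example of [3,7) and (3,7] above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    alpha: bool  # True if the lower endpoint is closed
    q: float
    r: float
    beta: bool   # True if the upper endpoint is closed


def span(x1: Interval, x2: Interval) -> Interval:
    """Smallest interval whose extension contains those of x1 and x2."""
    return Interval(
        # Closed below if whichever input supplies the minimum is closed there.
        alpha=(x1.alpha and x1.q <= x2.q) or (x2.alpha and x1.q >= x2.q),
        q=min(x1.q, x2.q),
        r=max(x1.r, x2.r),
        # Closed above if whichever input supplies the maximum is closed there.
        beta=(x1.beta and x1.r >= x2.r) or (x2.beta and x1.r <= x2.r),
    )


# span of (1,4) and [2,6] is (1,6]
print(span(Interval(False, 1, 4, False), Interval(True, 2, 6, True)))
# span of [3,7) and (3,7] is [3,7]
print(span(Interval(True, 3, 7, False), Interval(False, 3, 7, True)))
```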

Let us extend the notion of span to two sets of intervals by the
following definition:

$$\mathrm{SPAN}(I_1, I_2) \triangleq \bigcup_{i_1 \in I_1} \bigcup_{i_2 \in I_2} \mathrm{SPAN}(i_1, i_2)$$
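We will want to compute the span of two sets of intervals $I_1$ and $I_2$,
when both $I_1$ and $I_2$ are represented as spanning intervals. And we will also want the
resulting span to be represented as a small set of spanning intervals.

Given two normalized spanning intervals $\mathbf{i}_1$ and $\mathbf{i}_2$, their span
SPAN($\mathbf{i}_1$, $\mathbf{i}_2$) is a set of normalized spanning intervals whose extension is the span of
the extensions of $\mathbf{i}_1$ and $\mathbf{i}_2$. One can compute SPAN($\mathbf{i}_1$, $\mathbf{i}_2$) as follows:

$$\begin{aligned}
\mathrm{SPAN}( & {}_{\alpha_1}[{}_{\gamma_1}[i_1,j_1]_{\delta_1},\,{}_{\epsilon_1}[k_1,l_1]_{\zeta_1}]_{\beta_1},\;
{}_{\alpha_2}[{}_{\gamma_2}[i_2,j_2]_{\delta_2},\,{}_{\epsilon_2}[k_2,l_2]_{\zeta_2}]_{\beta_2}) \triangleq \\
& \langle {}_{\alpha_1}[{}_{\gamma_1}[i_1,j]_{\delta},\,{}_{\epsilon}[k,l_1]_{\zeta_1}]_{\beta_1} \rangle \cup
\langle {}_{\alpha_1}[{}_{\gamma_1}[i_1,j]_{\delta},\,{}_{\epsilon}[k,l_2]_{\zeta_2}]_{\beta_2} \rangle \;\cup \\
& \langle {}_{\alpha_2}[{}_{\gamma_2}[i_2,j]_{\delta},\,{}_{\epsilon}[k,l_1]_{\zeta_1}]_{\beta_1} \rangle \cup
\langle {}_{\alpha_2}[{}_{\gamma_2}[i_2,j]_{\delta},\,{}_{\epsilon}[k,l_2]_{\zeta_2}]_{\beta_2} \rangle
\end{aligned}$$

where
$j = \min(j_1, j_2)$
$k = \max(k_1, k_2)$
$\delta = [(\delta_1 \wedge j_1 \le j_2) \vee (\delta_2 \wedge j_1 \ge j_2)]$
$\epsilon = [(\epsilon_1 \wedge k_1 \ge k_2) \vee (\epsilon_2 \wedge k_1 \le k_2)]$

An important property of normalized spanning intervals is that for any
two normalized spanning intervals $\mathbf{i}_1$ and $\mathbf{i}_2$, SPAN($\mathbf{i}_1$, $\mathbf{i}_2$) contains at most four
normalized spanning intervals. In practice, however, fewer normalized spanning
intervals are needed, often only one.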

The intuition behind the above definition is as follows. Consider first
the lower endpoint. Suppose that the lower endpoints $q_1$ and $q_2$ of $\mathbf{i}_1$ and $\mathbf{i}_2$ are in
${}_{\gamma_1}[i_1,j_1]_{\delta_1}$ and ${}_{\gamma_2}[i_2,j_2]_{\delta_2}$ respectively. That means that $i_1 \le_{\gamma_1} q_1 \le_{\delta_1} j_1$ and
$i_2 \le_{\gamma_2} q_2 \le_{\delta_2} j_2$. The lower endpoint of $\mathrm{SPAN}(\mathbf{i}_1, \mathbf{i}_2)$ will be $q_1$, when $q_1 \le q_2$, and $q_2$,
when $q_2 \le q_1$. Thus it will be $q_1$, for all $i_1 \le_{\gamma_1} q_1 \le_{\delta} \min(j_1, j_2)$, and will be $q_2$, for
all $i_2 \le_{\gamma_2} q_2 \le_{\delta} \min(j_1, j_2)$, where $\delta = \delta_1$, when $j_1 \le j_2$, and $\delta = \delta_2$, when $j_1 \ge j_2$.
Thus there will be two potential ranges for the lower endpoint of $\mathrm{SPAN}(\mathbf{i}_1, \mathbf{i}_2)$:
${}_{\gamma_1}[i_1, \min(j_1,j_2)]_{\delta}$ and ${}_{\gamma_2}[i_2, \min(j_1,j_2)]_{\delta}$. When the lower endpoint of $\mathrm{SPAN}(\mathbf{i}_1, \mathbf{i}_2)$
is taken from the former, it will be open or closed depending on whether the lower
endpoint of $\mathbf{i}_1$ is open or closed. When it is taken from the latter, it will be open or
closed depending on whether the lower endpoint of $\mathbf{i}_2$ is open or closed. Thus the
lower endpoint of $\mathrm{SPAN}(\mathbf{i}_1, \mathbf{i}_2)$ can be either ${}_{\alpha_1}[{}_{\gamma_1}[i_1, \min(j_1,j_2)]_{\delta}$ or
${}_{\alpha_2}[{}_{\gamma_2}[i_2, \min(j_1,j_2)]_{\delta}$. Analogous reasoning can be applied to the upper endpoints. If
the upper endpoints of $\mathbf{i}_1$ and $\mathbf{i}_2$ are ${}_{\epsilon_1}[k_1,l_1]_{\zeta_1}]_{\beta_1}$ and ${}_{\epsilon_2}[k_2,l_2]_{\zeta_2}]_{\beta_2}$ respectively, then
there are two possibilities for the upper endpoint of $\mathrm{SPAN}(\mathbf{i}_1, \mathbf{i}_2)$, namely,
${}_{\epsilon}[\max(k_1,k_2), l_1]_{\zeta_1}]_{\beta_1}$ and ${}_{\epsilon}[\max(k_1,k_2), l_2]_{\zeta_2}]_{\beta_2}$, where $\epsilon = \epsilon_1$, when $k_1 \ge k_2$, and
$\epsilon = \epsilon_2$, when $k_1 \le k_2$.
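
The construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the class names `Range` and `Spanning` and the function `span_candidates` are hypothetical, a true flag is assumed to mean "closed", and the normalization step is omitted, so the function returns the four raw candidates before empty or redundant ones are discarded.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Range:
    """Range of admissible endpoint values; a True flag means that bound is closed."""
    lo_closed: bool
    lo: float
    hi: float
    hi_closed: bool

    def contains(self, x: float) -> bool:
        above = x >= self.lo if self.lo_closed else x > self.lo
        below = x <= self.hi if self.hi_closed else x < self.hi
        return above and below


@dataclass(frozen=True)
class Spanning:
    """Spanning interval: closedness flags a and b plus ranges for the two endpoints."""
    a: bool
    lower: Range
    upper: Range
    b: bool


def span_candidates(s1: Spanning, s2: Spanning) -> list:
    """Four un-normalized candidates for the span of two spanning intervals."""
    j = min(s1.lower.hi, s2.lower.hi)   # largest admissible lower endpoint
    k = max(s1.upper.lo, s2.upper.lo)   # smallest admissible upper endpoint
    delta = (s1.lower.hi_closed and s1.lower.hi <= s2.lower.hi) or \
            (s2.lower.hi_closed and s1.lower.hi >= s2.lower.hi)
    eps = (s1.upper.lo_closed and s1.upper.lo >= s2.upper.lo) or \
          (s2.upper.lo_closed and s1.upper.lo <= s2.upper.lo)
    out = []
    # The lower endpoint comes from s1 or s2, and independently so does the upper endpoint.
    for lo_src, hi_src in product((s1, s2), repeat=2):
        lower = Range(lo_src.lower.lo_closed, lo_src.lower.lo, j, delta)
        upper = Range(eps, k, hi_src.upper.hi, hi_src.upper.hi_closed)
        out.append(Spanning(lo_src.a, lower, upper, hi_src.b))
    return out


s1 = Spanning(True, Range(True, 0, 2, True), Range(True, 3, 5, True), True)
s2 = Spanning(True, Range(True, 1, 4, True), Range(True, 6, 8, True), True)
for candidate in span_candidates(s1, s2):
    print(candidate)
```

In this example two of the four candidates have an upper range running from 6 to 5, which is empty, so a normalization step would discard them. This illustrates why fewer than four spanning intervals are usually needed.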

COMPUTING THE $\mathcal{D}$ OF A NORMALIZED SPANNING INTERVAL

Given an Allen relation $r$ and a set $I$ of intervals, let $\mathcal{D}(r, I)$ denote the
set $J$ of all intervals $j$ such that $i \, r \, j$ for some $i \in I$. Given an Allen relation $r$ and a
normalized spanning interval $\mathbf{i}$, let $\mathcal{D}(r, \mathbf{i})$ denote a set of normalized spanning intervals
whose extension is $\mathcal{D}(r, I)$, where $I$ is the extension of $\mathbf{i}$. One can compute $\mathcal{D}(r, \mathbf{i})$ as
follows:

3(=,i) A {i} 

3(0, ^ [J*,,./,]^, a U U ff|A . Wl [^,Wf l ^ Aft «,[^»]T] A ) 
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9K oi , ax [ n [i, , 7, ]« , ei IK > z . ] fi ] A ) £ U L [ T t— > h 1 Ml* 

/>2e(T,F| 

5 ®(MJw,WM,]Ja) £ U < 



An important property of normalized spanning intervals is that for any
normalized spanning interval $\mathbf{i}$, $\mathcal{D}(r, \mathbf{i})$ contains at most 1, 4, 4, 2, 2, 4, 4, 2, 2, 2, 2, 4,
or 4 normalized spanning intervals when $r$ is =, <, >, m, mi, o, oi, s, si, f, fi, d, or di
respectively. In practice, however, fewer normalized spanning intervals are needed,
often only one.

The intuition behind the above definition is as follows. Let us handle
each of the cases separately.

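r = <  For any intervals $i'_1$ and $i'_2$ in the extensions of $\mathbf{i}_1$ and $\mathbf{i}_2$ respectively we want
$i'_1 < i'_2$. From (2) we get $r_1 \le_{\neg\beta_1 \wedge \neg\alpha_2} q_2$. And from (14) we get $k_1 \le_{\epsilon_1} r_1$.
Combining these we get $k_1 \le_{\neg\beta_1 \wedge \neg\alpha_2 \wedge \epsilon_1} q_2$. In this case, both $\alpha_2$ and $\beta_2$ are free
indicating that either endpoint of $i'_2$ can be open or closed.

r = >  For any intervals $i'_1$ and $i'_2$ in the extensions of $\mathbf{i}_1$ and $\mathbf{i}_2$ respectively we want
$i'_1 > i'_2$. From (3) we get $q_1 \ge_{\neg\alpha_1 \wedge \neg\beta_2} r_2$. And from (14) we get $q_1 \le_{\delta_1} j_1$.
Combining these we get $r_2 \le_{\neg\alpha_1 \wedge \neg\beta_2 \wedge \delta_1} j_1$. In this case, both $\alpha_2$ and $\beta_2$ are free
indicating that either endpoint of $i'_2$ can be open or closed.

r = m  For any intervals $i'_1$ and $i'_2$ in the extensions of $\mathbf{i}_1$ and $\mathbf{i}_2$ respectively we want
$i'_1 \; \mathrm{m} \; i'_2$. From (4) we get $r_1 = q_2$ and $\beta_1 \ne \alpha_2$. And from (14) we get
$k_1 \le_{\epsilon_1} r_1 \le_{\zeta_1} l_1$. Combining these we get $k_1 \le_{\epsilon_1} q_2 \le_{\zeta_1} l_1$ and $\beta_1 \ne \alpha_2$. In this
case, only $\beta_2$ is free indicating that the upper endpoint of $i'_2$ can be open or
closed.

r = mi  For any intervals $i'_1$ and $i'_2$ in the extensions of $\mathbf{i}_1$ and $\mathbf{i}_2$ respectively we want
$i'_1 \; \mathrm{mi} \; i'_2$. From (5) we get $q_1 = r_2$ and $\alpha_1 \ne \beta_2$. And from (14) we get
$i_1 \le_{\gamma_1} q_1 \le_{\delta_1} j_1$. Combining these we get $i_1 \le_{\gamma_1} r_2 \le_{\delta_1} j_1$ and $\alpha_1 \ne \beta_2$. In this
case, only $\alpha_2$ is free indicating that the lower endpoint of $i'_2$ can be open or
closed.

r = o  For any intervals $i'_1$ and $i'_2$ in the extensions of $\mathbf{i}_1$ and $\mathbf{i}_2$ respectively we want
$i'_1 \; \mathrm{o} \; i'_2$. From (6) we get $q_1 \le_{\alpha_1 \wedge \neg\alpha_2} q_2 \le_{\beta_1 \wedge \alpha_2} r_1 \le_{\neg\beta_1 \wedge \beta_2} r_2$. And from (14) we get
$i_1 \le_{\gamma_1} q_1$ and $k_1 \le_{\epsilon_1} r_1 \le_{\zeta_1} l_1$. Combining these we get
$i_1 \le_{\alpha_1 \wedge \neg\alpha_2 \wedge \gamma_1} q_2 \le_{\beta_1 \wedge \alpha_2 \wedge \zeta_1} l_1$ and $k_1 \le_{\neg\beta_1 \wedge \beta_2 \wedge \epsilon_1} r_2$. In this case, both $\alpha_2$ and $\beta_2$
are free indicating that either endpoint of $i'_2$ can be open or closed.

r = oi  For any intervals $i'_1$ and $i'_2$ in the extensions of $\mathbf{i}_1$ and $\mathbf{i}_2$ respectively we want
$i'_1 \; \mathrm{oi} \; i'_2$. From (7) we get $q_2 \le_{\neg\alpha_1 \wedge \alpha_2} q_1 \le_{\alpha_1 \wedge \beta_2} r_2 \le_{\beta_1 \wedge \neg\beta_2} r_1$. And from (14) we
get $r_1 \le_{\zeta_1} l_1$ and $i_1 \le_{\gamma_1} q_1 \le_{\delta_1} j_1$. Combining these we get
$i_1 \le_{\alpha_1 \wedge \beta_2 \wedge \gamma_1} r_2 \le_{\beta_1 \wedge \neg\beta_2 \wedge \zeta_1} l_1$ and $q_2 \le_{\neg\alpha_1 \wedge \alpha_2 \wedge \delta_1} j_1$. In this case, both $\alpha_2$ and $\beta_2$
are free indicating that either endpoint of $i'_2$ can be open or closed.

r = s  For any intervals $i'_1$ and $i'_2$ in the extensions of $\mathbf{i}_1$ and $\mathbf{i}_2$ respectively we want
$i'_1 \; \mathrm{s} \; i'_2$. From (8) we get $q_1 = q_2$, $\alpha_1 = \alpha_2$, and $r_1 \le_{\neg\beta_1 \wedge \beta_2} r_2$. And from (14) we
get $i_1 \le_{\gamma_1} q_1 \le_{\delta_1} j_1$ and $k_1 \le_{\epsilon_1} r_1$. Combining these we get $\alpha_1 = \alpha_2$,
$i_1 \le_{\gamma_1} q_2 \le_{\delta_1} j_1$, and $k_1 \le_{\neg\beta_1 \wedge \beta_2 \wedge \epsilon_1} r_2$. In this case, only $\beta_2$ is free indicating that
the upper endpoint of $i'_2$ can be open or closed.

r = si  For any intervals $i'_1$ and $i'_2$ in the extensions of $\mathbf{i}_1$ and $\mathbf{i}_2$ respectively we want
$i'_1 \; \mathrm{si} \; i'_2$. From (9) we get $q_1 = q_2$, $\alpha_1 = \alpha_2$, and $r_1 \ge_{\beta_1 \wedge \neg\beta_2} r_2$. And from (14) we
get $i_1 \le_{\gamma_1} q_1 \le_{\delta_1} j_1$ and $r_1 \le_{\zeta_1} l_1$. Combining these we get $\alpha_1 = \alpha_2$,
$i_1 \le_{\gamma_1} q_2 \le_{\delta_1} j_1$, and $r_2 \le_{\beta_1 \wedge \neg\beta_2 \wedge \zeta_1} l_1$. In this case, only $\beta_2$ is free indicating that
the upper endpoint of $i'_2$ can be open or closed.

r = f  For any intervals $i'_1$ and $i'_2$ in the extensions of $\mathbf{i}_1$ and $\mathbf{i}_2$ respectively we want
$i'_1 \; \mathrm{f} \; i'_2$. From (10) we get $q_1 \ge_{\neg\alpha_1 \wedge \alpha_2} q_2$, $r_1 = r_2$, and $\beta_1 = \beta_2$. And from (14)
we get $k_1 \le_{\epsilon_1} r_1 \le_{\zeta_1} l_1$ and $q_1 \le_{\delta_1} j_1$. Combining these we get $\beta_1 = \beta_2$,
$k_1 \le_{\epsilon_1} r_2 \le_{\zeta_1} l_1$, and $q_2 \le_{\neg\alpha_1 \wedge \alpha_2 \wedge \delta_1} j_1$. In this case, only $\alpha_2$ is free indicating that
the lower endpoint of $i'_2$ can be open or closed.

r = fi  For any intervals $i'_1$ and $i'_2$ in the extensions of $\mathbf{i}_1$ and $\mathbf{i}_2$ respectively we want
$i'_1 \; \mathrm{fi} \; i'_2$. From (11) we get $q_1 \le_{\alpha_1 \wedge \neg\alpha_2} q_2$, $r_1 = r_2$, and $\beta_1 = \beta_2$. And from (14)
we get $k_1 \le_{\epsilon_1} r_1 \le_{\zeta_1} l_1$ and $i_1 \le_{\gamma_1} q_1$. Combining these we get $\beta_1 = \beta_2$,
$k_1 \le_{\epsilon_1} r_2 \le_{\zeta_1} l_1$, and $i_1 \le_{\alpha_1 \wedge \neg\alpha_2 \wedge \gamma_1} q_2$. In this case, only $\alpha_2$ is free indicating that
the lower endpoint of $i'_2$ can be open or closed.

r = d  For any intervals $i'_1$ and $i'_2$ in the extensions of $\mathbf{i}_1$ and $\mathbf{i}_2$ respectively we want
$i'_1 \; \mathrm{d} \; i'_2$. From (12) we get $q_1 \ge_{\neg\alpha_1 \wedge \alpha_2} q_2$ and $r_1 \le_{\neg\beta_1 \wedge \beta_2} r_2$. And from (14) we
get $q_1 \le_{\delta_1} j_1$ and $k_1 \le_{\epsilon_1} r_1$. Combining these we get $q_2 \le_{\neg\alpha_1 \wedge \alpha_2 \wedge \delta_1} j_1$ and
$k_1 \le_{\neg\beta_1 \wedge \beta_2 \wedge \epsilon_1} r_2$. In this case, both $\alpha_2$ and $\beta_2$ are free indicating that either
endpoint of $i'_2$ can be open or closed.

r = di  For any intervals $i'_1$ and $i'_2$ in the extensions of $\mathbf{i}_1$ and $\mathbf{i}_2$ respectively we want
$i'_1 \; \mathrm{di} \; i'_2$. From (13) we get $q_1 \le_{\alpha_1 \wedge \neg\alpha_2} q_2$ and $r_1 \ge_{\beta_1 \wedge \neg\beta_2} r_2$. And from (14) we
get $i_1 \le_{\gamma_1} q_1$ and $r_1 \le_{\zeta_1} l_1$. Combining these we get $i_1 \le_{\alpha_1 \wedge \neg\alpha_2 \wedge \gamma_1} q_2$ and
$r_2 \le_{\beta_1 \wedge \neg\beta_2 \wedge \zeta_1} l_1$. In this case, both $\alpha_2$ and $\beta_2$ are free indicating that either
endpoint of $i'_2$ can be open or closed.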

COMPUTING THE $\mathcal{I}$ OF TWO NORMALIZED SPANNING INTERVALS

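Given an Allen relation $r$ and two sets $I$ and $J$ of intervals, let $\mathcal{I}(I, r, J)$
denote the set $K$ of all intervals $k$ such that $k = \mathrm{Span}(i, j)$ for some $i \in I$ and $j \in J$,
where $i \, r \, j$. Given an Allen relation $r$ and two normalized spanning intervals $\mathbf{i}$ and $\mathbf{j}$, let
$\mathcal{I}(\mathbf{i}, r, \mathbf{j})$ denote a set of normalized spanning intervals whose extension is $\mathcal{I}(I, r, J)$, where
$I$ and $J$ are the extensions of $\mathbf{i}$ and $\mathbf{j}$ respectively. One can compute $\mathcal{I}(\mathbf{i}, r, \mathbf{j})$ as follows:

$$\mathcal{I}(\mathbf{i}, r, \mathbf{j}) \triangleq \bigcup_{\mathbf{i}' \in \mathcal{D}(r^{-1}, \mathbf{j})} \ \bigcup_{\mathbf{i}'' \in \mathbf{i}' \cap \mathbf{i}} \ \bigcup_{\mathbf{j}' \in \mathcal{D}(r, \mathbf{i})} \ \bigcup_{\mathbf{j}'' \in \mathbf{j}' \cap \mathbf{j}} \mathrm{SPAN}(\mathbf{i}'', \mathbf{j}'')$$
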

It is easy to see that $|\mathcal{I}(\cdot, r, \cdot)| \le 4\,|\mathcal{D}(r, \cdot)|^2$. Thus an important property of normalized
spanning intervals is that for any two normalized spanning intervals $\mathbf{i}$ and $\mathbf{j}$, $\mathcal{I}(\mathbf{i}, r, \mathbf{j})$
contains at most 4, 64, 64, 16, 16, 64, 64, 16, 16, 16, 16, 64, or 64 normalized
spanning intervals, when $r$ is =, <, >, m, mi, o, oi, s, si, f, fi, d, or di respectively.
While simple combinatorial enumeration yields the above weak bounds on the number
of normalized spanning intervals needed to represent $\mathcal{I}(\mathbf{i}, r, \mathbf{j})$, in practice, far fewer
normalized spanning intervals are needed, in most cases only one.
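
The thirteen bounds quoted above follow mechanically from the per-relation bounds stated earlier for the single-interval case, using the factor of four contributed by the span of two normalized spanning intervals. The short check below recomputes them; the variable names are illustrative only.

```python
relations = ["=", "<", ">", "m", "mi", "o", "oi", "s", "si", "f", "fi", "d", "di"]

# Upper bounds on the number of normalized spanning intervals produced
# for a single spanning interval, one per Allen relation (as stated in the text).
d_bounds = [1, 4, 4, 2, 2, 4, 4, 2, 2, 2, 2, 4, 4]

# Bound for two spanning intervals: 4 * bound**2.
i_bounds = {r: 4 * n * n for r, n in zip(relations, d_bounds)}
print(i_bounds)
```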

The intuition behind the above definition is as follows. Let $I$ and $J$ be
the extensions of $\mathbf{i}$ and $\mathbf{j}$ respectively. The extension of the set of all $\mathbf{i}'$ is the set of all
intervals $i$ such that $i \, r \, j$ for some $j$ in $J$. And the extension of the set of all $\mathbf{i}''$ is the set
of all intervals $i$ in $I$ such that $i \, r \, j$ for some $j$ in $J$. Similarly, the extension of the set of
all $\mathbf{j}'$ is the set of all intervals $j$ such that $i \, r \, j$ for some $i$ in $I$. And the extension of the
set of all $\mathbf{j}''$ is the set of all intervals $j$ in $J$ such that $i \, r \, j$ for some $i$ in $I$. Thus the
extension of the set of all $\mathrm{SPAN}(\mathbf{i}'', \mathbf{j}'')$ is the set of all intervals $k$ such that
$k = \mathrm{Span}(i, j)$ where $i$ is in $I$, $j$ is in $J$, and $i \, r \, j$.

AN EFFICIENT INFERENCE PROCEDURE FOR EVENT LOGIC 

Given the above procedures for computing $\langle \mathbf{i} \rangle$, $\mathbf{i}_1 \cap \mathbf{i}_2$, $\neg \mathbf{i}$,
$\mathrm{SPAN}(\mathbf{i}_1, \mathbf{i}_2)$, $\mathcal{D}(r, \mathbf{i})$, and $\mathcal{I}(\mathbf{i}, r, \mathbf{j})$, one can now define a procedure for computing
$\mathcal{E}(M, \Phi)$. This procedure takes a model $M$ along with an event-logic expression $\Phi$
and computes a set of normalized spanning intervals that represents the set $I$ of
intervals $i$ for which $\Phi @ i$ is true. The model $M$ is a set of atomic event-occurrence
formulae of the form $p(c_1, \ldots, c_n) @ \mathbf{i}$, where $p(c_1, \ldots, c_n)$ is a ground primitive event-logic
expression and $\mathbf{i}$ is a normalized spanning interval. A model entry $p(c_1, \ldots, c_n) @ \mathbf{i}$
indicates that the primitive event $p(c_1, \ldots, c_n)$ occurred during all intervals in the
extension of $\mathbf{i}$. $C(M)$ denotes the set of all constants in all ground primitive event-logic
expressions in $M$.

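$$\mathcal{E}(M, p(c_1, \ldots, c_n)) \triangleq \{\mathbf{i} \mid p(c_1, \ldots, c_n) @ \mathbf{i} \in M\}$$
$$\mathcal{E}(M, \Phi \vee \Psi) \triangleq \mathcal{E}(M, \Phi) \cup \mathcal{E}(M, \Psi)$$
$$\mathcal{E}(M, \forall x \Phi) \triangleq \bigcup_{\mathbf{i}_1 \in \mathcal{E}(M, \Phi[x := c_1])} \cdots \bigcup_{\mathbf{i}_n \in \mathcal{E}(M, \Phi[x := c_n])} \mathbf{i}_1 \cap \cdots \cap \mathbf{i}_n \qquad \text{where } C(M) = \{c_1, \ldots, c_n\}$$
$$\mathcal{E}(M, \exists x \Phi) \triangleq \bigcup_{c \in C(M)} \mathcal{E}(M, \Phi[x := c])$$
$$\mathcal{E}(M, \neg \Phi) \triangleq \bigcup_{\mathbf{i}'_1 \in \neg \mathbf{i}_1} \cdots \bigcup_{\mathbf{i}'_n \in \neg \mathbf{i}_n} \mathbf{i}'_1 \cap \cdots \cap \mathbf{i}'_n \qquad \text{where } \mathcal{E}(M, \Phi) = \{\mathbf{i}_1, \ldots, \mathbf{i}_n\}$$
$$\mathcal{E}(M, \Phi \wedge_R \Psi) \triangleq \bigcup_{\mathbf{i} \in \mathcal{E}(M, \Phi)} \ \bigcup_{\mathbf{j} \in \mathcal{E}(M, \Psi)} \ \bigcup_{r \in R} \mathcal{I}(\mathbf{i}, r, \mathbf{j})$$
$$\mathcal{E}(M, \Diamond_R \Phi) \triangleq \bigcup_{\mathbf{i} \in \mathcal{E}(M, \Phi)} \ \bigcup_{r \in R} \mathcal{D}(r, \mathbf{i})$$
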

The procedure performs structural induction on $\Phi$ as set forth in more
detail in FIG. 2. It computes a set of normalized spanning intervals to represent
the occurrence of each atomic event-logic expression in $\Phi$ and recursively combines
the sets so computed for each child subexpression to yield the sets for each parent
subexpression. An important property of this inference procedure is that for any finite
model $M$, $\mathcal{E}(M, \Phi)$, the set $I$ of intervals $i$ for which $\Phi @ i$ is true, can be represented
by a finite set of normalized spanning intervals. Nominally, the number of normalized
spanning intervals in $\mathcal{E}(M, \Phi)$ can be exponential in the subexpression depth of $\Phi$
because each step in the structural induction can introduce a constant factor growth in
the size of the set. However, in practice, such exponential growth does not occur.
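
The shape of the structural induction can be illustrated with a toy evaluator. This sketch deliberately swaps the spanning-interval representation for explicit finite sets of half-open integer intervals over a short timeline, so it shows only the recursion and not the efficient representation described in this specification. The tuple encoding of expressions, the timeline length `T`, and all function names are assumptions made for illustration.

```python
T = 4  # hypothetical timeline length
# Universe of half-open integer intervals [q, r) with 0 <= q < r <= T.
UNIVERSE = frozenset((q, r) for q in range(T) for r in range(q + 1, T + 1))

# Allen relations on half-open intervals; a and b are (start, end) pairs.
ALLEN = {
    "=": lambda a, b: a == b,
    "<": lambda a, b: a[1] < b[0],
    "m": lambda a, b: a[1] == b[0],
    "o": lambda a, b: a[0] < b[0] < a[1] < b[1],
    "s": lambda a, b: a[0] == b[0] and a[1] < b[1],
    "f": lambda a, b: a[1] == b[1] and a[0] > b[0],
    "d": lambda a, b: a[0] > b[0] and a[1] < b[1],
}
# Inverse relations: a (inverse of r) b holds exactly when b r a holds.
for name, inverse in [("<", ">"), ("m", "mi"), ("o", "oi"),
                      ("s", "si"), ("f", "fi"), ("d", "di")]:
    ALLEN[inverse] = (lambda rel: lambda a, b: rel(b, a))(ALLEN[name])


def span(a, b):
    return (min(a[0], b[0]), max(a[1], b[1]))


def subst(expr, var, const):
    """Replace variable var by const in the arguments of every primitive."""
    if expr[0] == "prim":
        return ("prim", expr[1], tuple(const if x == var else x for x in expr[2]))
    return tuple(subst(e, var, const) if isinstance(e, tuple) else e for e in expr)


def constants(model):
    return sorted({c for (_, args, _) in model for c in args})


def occurrences(model, expr):
    """Set of intervals on which expr holds, by structural induction on expr."""
    op = expr[0]
    if op == "prim":
        return {i for (p, args, i) in model if (p, args) == (expr[1], expr[2])}
    if op == "or":
        return occurrences(model, expr[1]) | occurrences(model, expr[2])
    if op == "not":
        return set(UNIVERSE) - occurrences(model, expr[1])
    if op in ("exists", "forall"):
        sets = [occurrences(model, subst(expr[2], expr[1], c)) for c in constants(model)]
        if op == "exists":
            return set().union(*sets)
        return set(UNIVERSE).intersection(*sets)
    if op == "and":  # ("and", R, phi, psi): spans of pairs related by some r in R
        _, rels, phi, psi = expr
        left, right = occurrences(model, phi), occurrences(model, psi)
        return {span(i, j) for i in left for j in right
                if any(ALLEN[r](i, j) for r in rels)}
    if op == "diamond":  # ("diamond", R, phi): intervals j with i r j for some occurrence i
        _, rels, phi = expr
        occ = occurrences(model, expr[2])
        return {j for j in UNIVERSE if any(ALLEN[r](i, j) for i in occ for r in rels)}
    raise ValueError(f"unknown operator: {op}")


# Hypothetical model: Supported(a) on [0, 2) and Attached(a, b) on [2, 4).
model = {("Supported", ("a",), (0, 2)), ("Attached", ("a", "b"), (2, 4))}
expr = ("and", frozenset({"m"}),
        ("prim", "Supported", ("a",)), ("prim", "Attached", ("a", "b")))
print(occurrences(model, expr))
```

Each branch corresponds to one case of the procedure: lookup for primitives, union for disjunction and existential quantification, intersection for universal quantification, complement for negation, spans of related pairs for the relational conjunction, and the set of related intervals for the tense operator.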

EXAMPLE 

The methods of the present invention have been implemented as a 
computer system for recognizing events in video sequences. However, the recognition
of events in video sequences is given by way of example only and not to limit the
scope or spirit of the present invention. Those skilled in the art will realize that there 
are many other applications for the methods of the present invention. 

The computer system takes short (typically 30 to 120 frame) video 
sequences as input. These video sequences depict a person performing various actions 
with colored blocks, such as pick up, put down, stack, unstack, move, assemble, and 
disassemble. A video sequence can depict no defined action, one defined action, or 
multiple defined actions. Multiple defined actions may be sequential and/or 
simultaneous and may be the same action or different actions. The computer system 
labels each video sequence with the actions that are being performed as well as the 
particular interval in the video sequence during which those actions were performed. 

This computer system uses the methods of the present invention to 
perform event classification. FIG. 3 shows the primitive event types used by this 
computer system. The intervals in the input video sequences during which these 
primitive event types hold are computed using techniques in the prior art. These 
intervals, however, are represented as spanning intervals introduced by the present 
invention. This specification of the primitive event types and the mechanism for 
computing the intervals corresponding to their occurrence in input video sequences 
constitutes an application of steps 102 and 106 as shown in FIG. 1. FIG. 4 shows the 
compound event types used by this computer system. These compound event types 
are specified as event-logic expressions over the primitive event types from FIG. 3. 
This specification of the compound event types constitutes an application of step 104 
as shown in FIG. 1.

Pickup(x, y, z) denotes an event type where x picks y up off of z. It
is specified as a sequence of three intervals, where x is not attached to and does not 
support y in the first interval but is attached to and does support y in the third interval. 
And z supports y in the first interval but does not support y in the third interval. 
Additionally, several conditions must hold in both the first and third intervals: x must
be unsupported, y must not support either z or x, x and z must not support each other,
and y must not be attached to z. During the second interval, intermediate between the 
first and third intervals, either x is attached to y or y is attached to z. Additionally, 
several conditions must hold throughout the entire event: x, y, and z must be distinct 
and y must be supported. Putdown(x, y, z) denotes an event type where x puts y
down on z. It is specified in a fashion that is similar to Pickup(x, y, z) but where the
three subevents occur in reverse order. Stack(w, x, y, z) denotes an event type
where w puts x down on y which is resting on z. It is specified as Putdown(w, x, y),
where z supports but is not attached to y and z is distinct from w, x, and y.
Unstack(w, x, y, z) denotes an event type where w picks x up off of y which is
resting on z. It is specified as Pickup(w, x, y), where z supports but is not attached to
y and z is distinct from w, x, and y. Move(w, x, y, z) denotes an event type where w
picks x up off of y and puts it down on z which is distinct from y.
Assemble(w, x, y, z) denotes an event type where w first puts y down on z then
sometime later stacks x on top of y. Finally, Disassemble(w, x, y, z) denotes an
event type where w first unstacks x from on top of y (which is resting on z) and then
sometime later picks y up off of z. FIGS. 5A and 5B show sample movies depicting
occurrences of the event types Pickup(x, y, z) and Putdown(x, y, z), respectively.
FIGS. 7A - 7E show sample movies depicting occurrences of the event types
Stack(w, x, y, z), Unstack(w, x, y, z), Move(w, x, y, z), Assemble(w, x, y, z),
and Disassemble(w, x, y, z), respectively.

Nominally, all atomic event-logic expressions are primitive event
types. However, we allow giving a name to a compound event-logic expression and
using this name in another event-logic expression as shorthand for the named
expression with appropriate parameter substitution. This is simply a macro-expansion
process and, as such, no recursion is allowed. This feature is used in FIG. 4 to define
Unstack, Move, and Disassemble in terms of Pickup; Stack, Move, and
Assemble in terms of Putdown; Assemble in terms of Stack, which is itself defined
in terms of Putdown; and Disassemble in terms of Unstack, which is itself defined
in terms of Pickup.
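
The macro-expansion process can be sketched as follows. The tuple encoding of expressions, the function names, and the two macro bodies are hypothetical placeholders chosen for brevity; they are not the definitions of FIG. 4. The sketch shows parameter substitution and the rejection of recursive definitions.

```python
def substitute(expr, binding):
    """Replace parameter names by argument values in every atom of expr."""
    if expr[0] == "atom":
        _, name, args = expr
        return ("atom", name, tuple(binding.get(a, a) for a in args))
    return (expr[0],) + tuple(substitute(child, binding) for child in expr[1:])


def expand(expr, macros, active=()):
    """Expand named compound event types until only primitive atoms remain."""
    if expr[0] != "atom":
        return (expr[0],) + tuple(expand(child, macros, active) for child in expr[1:])
    _, name, args = expr
    if name not in macros:
        return expr  # primitive event type: nothing to expand
    if name in active:
        raise ValueError(f"recursive definition of {name}")
    params, body = macros[name]
    body = substitute(body, dict(zip(params, args)))
    return expand(body, macros, active + (name,))


# Placeholder definitions, far simpler than the real compound event types.
macros = {
    "Putdown": (("x", "y", "z"),
                ("and", ("atom", "Attached", ("x", "y")),
                        ("atom", "Supports", ("z", "y")))),
    "Stack": (("w", "x", "y", "z"),
              ("and", ("atom", "Putdown", ("w", "x", "y")),
                      ("atom", "Supports", ("z", "y")))),
}

print(expand(("atom", "Stack", ("hand", "red", "green", "table")), macros))
```

Tracking the names currently being expanded is one simple way to enforce the no-recursion rule; a definition that refers to itself, directly or indirectly, raises an error instead of looping.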

The methods of the present invention have been applied to compute the
occurrences of the compound event types from FIG. 4 from the occurrences of the
primitive event types from FIG. 3 as recovered from a number of input video
sequences. This constitutes an implementation of step 108 as shown in FIG. 1. FIGS.
6, 8, 10, and 12 show both the primitive event occurrences that have been recovered
from the input video sequences in FIGS. 5, 7, 9, and 11 using methods in the prior art
as well as the compound event occurrences that have been recognized out of the
primitive event occurrences using the methods of the present invention. FIGS. 5A,
5B, and 7A-7E show sample movies that depict the seven compound event types pick
up, put down, stack, unstack, move, assemble, and disassemble respectively. The
results of applying segmentation, tracking, and model reconstruction methods of the
prior art on these video sequences are shown overlaid on the video sequences. FIGS.
6A, 6B, and 8A-8E show the results of applying the event classification methods of
the present invention on these movies. These figures show that the computer system
implementation of the methods of the present invention correctly recognized the
intended event class for each movie.

In FIG. 5A, frames 0 through 1 correspond to the first subevent of a
pick up event, frames 2 through 13 correspond to the second subevent, and frames 14
through 22 correspond to the third subevent. In FIG. 5B, frames 0 through 13
correspond to the first subevent of a put down event, frames 14 through 22 correspond
to the second subevent, and frames 23 through 32 correspond to the third subevent.
The computer system correctly recognized these as instances of pick up and put down
respectively. In FIG. 7A, frames 0 through 11, 12 through 23, and 24 through 30
correspond to the three subevents of a put down event. The computer system correctly
recognized this as a put down event and also as a stack event. In FIG. 7B, frames 0
through 10, 11 through 24, and 25 through 33 correspond to the three subevents of a
pick up event. The computer system correctly recognized this as a pick up event and
also as an unstack event. In FIG. 7C, frames 0 through 8, 9 through 16, and 17 through
45 correspond to the three subevents of a pick up event and frames 17 through 33, 34
through 45, and 46 through 52 correspond to the three subevents of a put down event.
The computer system correctly recognized the combination of these two events as a 
move event. In FIG. 7D, frames 18 through 32, 33 through 40, and 41 through 46 
correspond to the three subevents of a put down event and frames 57 through 67 and 
68 through 87 correspond to the first and third subevents of a second put down event, 
with the second subevent being empty. The latter put down event was also correctly 
recognized as a stack event and the combination of these two events was correctly 
recognized as an assemble event. In FIG. 7E, frames 0 through 18, 19 through 22, and 
23 through 50 correspond to the three subevents of a pick up event and frames 23 
through 56, 57 through 62, and 63 through 87 correspond to the three subevents of a 
second pick up event. The former pick up event was also correctly recognized as an 
unstack event and the combination of these two events was correctly recognized as a 
disassemble event. These examples show that the computer system correctly 
recognized each of the seven event types with no false positives. 

As discussed in the introduction, using force dynamics and event logic 
to recognize events offers several advantages over the prior art of using motion profile 
and hidden Markov models. These advantages are: 

(1) robustness against variance in motion profile; 

(2) robustness against presence of extraneous objects in the field of 
view; 

(3) ability to perform temporal and spatial segmentation of events; and 

(4) ability to detect non-occurrence of events. 

FIGS. 9A - 9D and 10A - 10D illustrate the first three of these
advantages while FIGS. 11A, 11B, 12A, and 12B illustrate the fourth. FIG. 9A shows a
pick up event from the left in contrast to FIG. 5A which is from the right. Even 
though these have different motion profiles, FIG. 10A shows that the computer system
correctly recognized that these exhibit the same sequence of changes in force-dynamic
relations and constitute the same event type, namely pick up. FIG. 9B shows a pick up
event with two extraneous blocks in the field of view. FIG. 10B shows that the
computer system correctly recognized that these extraneous blocks do not participate
in any events and, despite their presence, the truth conditions for a pick up event still
hold between the other objects. FIG. 9C shows a pick up event, followed by a put
down event, followed by another pick up event, followed by another put down event.
FIG. 10C shows that the computer system correctly recognized this sequence of four
event occurrences. FIG. 9D shows two simultaneous pick up events. FIG. 10D shows
that the computer system correctly recognized these two simultaneous event
occurrences. Finally, FIGS. 11A and 11B show two non-events. FIGS. 12A and 12B
show that the computer system is not fooled into thinking that these constitute pick up
or put down events, even though portions of these movies have motion profiles similar
to those of pick up and put down events. Therefore, the computer system correctly recognizes
that these movies do not match any known event types.



The methods of the present invention are incorporated in a
comprehensive implemented system for recovering event occurrences from video
input. It differs from prior approaches to the same problem in two fundamental ways.
It uses state changes in the force-dynamic relations between objects, instead of motion
profile, as the key descriptive element in defining event types. And it uses event logic,
instead of hidden Markov models, to perform event classification. One key result of
the methods of the present invention is the formulation of spanning intervals, a novel
efficient representation of the infinite sets of intervals that arise when processing
liquid and semi-liquid events. A second key result is the formulation of an efficient
procedure, based on spanning intervals, for inferring all occurrences of compound
event types from occurrences of primitive event types. The techniques of
force-dynamic model reconstruction, spanning intervals, and event-logic inference have
been used to successfully recognize seven event types from real video: pick up, put
down, stack, unstack, move, assemble, and disassemble. Using force dynamics and
event logic to perform event recognition offers four key advantages over the prior art
of using motion profile and hidden Markov models. First, it is insensitive to variance
in the motion profile of an event occurrence. Second, it is insensitive to the presence
of extraneous objects in the field of view. Third, it allows temporal segmentation of
sequential and parallel event occurrences. And fourth, it robustly detects the
non-occurrence of events as well as their occurrence.

As discussed above, the methods of the present invention are 
particularly suited to be carried out by a computer software program, such computer 
software program preferably containing modules corresponding to the individual steps 
of the methods. Such software can, of course, be embodied in a computer-readable
medium, such as an integrated circuit or a peripheral device. 



While there has been shown and described what is considered to be
preferred embodiments of the invention, it will, of course, be understood that various
modifications and changes in form or detail could readily be made without departing
from the spirit of the invention. It is therefore intended that the invention be not
limited to the exact form described and illustrated, but should be constructed to cover
all modifications that may fall within the scope of the appended claims.